70 research outputs found

    Extended nominal coreference and bridging anaphora (an approach to annotation of Czech data in Prague dependency treebank)

    The dissertation presents one possible model for processing extended textual coreference and bridging anaphora in a large text corpus, which we then use to annotate these relations in the texts of the Prague Dependency Treebank (PDT). Drawing on the literature on the theory of reference and discourse and on other findings of theoretical linguistics on the one hand, and on existing annotation methodologies on the other, we created a detailed classification of textual coreference relations and types of bridging anaphora. Within textual coreference, we distinguish two types of relations: coreference between noun phrases with specific reference and coreference between noun phrases with non-specific, primarily generic, reference. For bridging anaphora, we defined six types of relations: PART, between a part and a whole; SUBSET, between a set and its subset or element; FUNCT, between an entity and a unique function on that entity; CONTRAST, between semantic and contextual opposites; ANAF, for anaphoric reference between non-coreferential entities; and REST, for other cases of bridging anaphora. One of the goals of the research was to create a system of theoretical principles to be followed when annotating coreference relations and bridging anaphora. Within this system, for example, the principle...
    Institute of Czech Language and Theory of Communication, Faculty of Arts

    Translation of "It" in a Deep Syntax Framework

    We present a novel approach to translating the English personal pronoun it into Czech. We conduct a linguistic analysis of how the distinct categories of it are usually mapped to their Czech counterparts. Armed with these observations, we design a discriminative translation model of it, which is then integrated into the TectoMT deep syntax MT framework. Features in the model take advantage of the rich syntactic annotation that TectoMT is based on, external tools for anaphoricity resolution, lexical co-occurrence frequencies measured on a large parallel corpus, and gold coreference annotation. Even though the new model for it exhibits no improvement in terms of BLEU, manual evaluation shows that it outperforms the original solution in 8.5% of sentences containing it.

    Coreference chains in Czech, English and Russian: Preliminary findings

    This article is a pilot comparative study of coreference chains in Czech, English and Russian. We analyzed 16 comparable texts in the three languages. Our motivation was to determine the linguistic structure of coreference chains in these languages and to identify which factors influence this structure.

    Two Case Studies on Translating Pronouns in a Deep Syntax Framework

    We focus on improving the translation of the English pronoun it and English reflexive pronouns in an English-Czech syntax-based machine translation framework. Our evaluation, from both intrinsic and extrinsic perspectives, shows that adding specialized syntactic and coreference-related features leads to an improvement in translation quality.

    Findings of the Shared Task on Multilingual Coreference Resolution

    This paper presents an overview of the shared task on multilingual coreference resolution associated with the CRAC 2022 workshop. Shared task participants were supposed to develop trainable systems capable of identifying mentions and clustering them according to identity coreference. The public edition of CorefUD 1.0, which contains 13 datasets for 10 languages, was used as the source of training and evaluation data. The CoNLL score used in previous coreference-oriented shared tasks served as the main evaluation metric. There were 8 coreference prediction systems submitted by 5 participating teams; in addition, the organizers provided a competitive Transformer-based baseline system at the beginning of the shared task. The winning system outperformed the baseline by 12 percentage points (in terms of the CoNLL scores averaged across all datasets for individual languages).

    Molecular phylogeny of one extinct and two critically endangered Central Asian sturgeon species (genus Pseudoscaphirhynchus) based on their mitochondrial genomes

    The enigmatic and poorly studied sturgeon genus Pseudoscaphirhynchus (Scaphirhynchinae: Acipenseridae) comprises three species: the Amu Darya shovelnose sturgeon (Pseudoscaphirhynchus kaufmanni (Bogdanow)), the dwarf Amu Darya shovelnose sturgeon (P. hermanni (Kessler)), and the Syr Darya shovelnose sturgeon (P. fedtschenkoi (Bogdanow)). Two species, P. hermanni and P. kaufmanni, are critically endangered due to the ecological disaster in the Aral Sea area, caused by massive water use for irrigation to support cotton agriculture and by the subsequent pesticide pollution and habitat degradation. The third species, P. fedtschenkoi, has not been sighted since the 1960s and is believed to be extinct, both in nature and in captivity. In this study, complete mitochondrial (mt) genomes of these three species of Pseudoscaphirhynchus were characterized using Illumina and Sanger sequencing platforms. Phylogenetic analyses showed significant divergence between the Amu Darya and Syr Darya freshwater sturgeons and supported the monophyletic origin of the Pseudoscaphirhynchus species. We confirmed that the two sympatric Amu Darya species, P. kaufmanni and P. hermanni, form a single genetic cluster, which may require further morphological and genetic study to assess possible hybridization, intraspecific variation and taxonomic status and to develop conservation measures to protect these unique fishes.

    New insights into the human brain’s cognitive organization: Views from the top, from the bottom, from the left and, particularly, from the right

    The view that the left cerebral hemisphere in humans “dominates” over the “subdominant” right hemisphere has been so deeply entrenched in neuropsychology that no amount of evidence seems able to overcome it. In this article, we examine inhibitory cause-and-effect connectivity among human brain structures related to different parts of the triune evolutionary stratification (archicortex, paleocortex and neocortex) in relation to early and late phases of a prolonged resting-state functional magnetic resonance imaging (fMRI) experiment. With respect to the evolutionarily youngest parts of the human cortex, the left and right frontopolar regions, we also provide data on the asymmetries in underlying molecular mechanisms, namely on the differential expression of the protein-coding genes and regulatory microRNA sequences. In both domains of research, our results contradict the established view by demonstrating a pronounced right-to-left vector of causation in the hemispheric interaction at multiple levels of brain organization. There may be several explanations, not mutually exclusive, for the evolutionary significance of this pattern of lateralization. One of them emphasizes the computational advantage of separating the neural substrates for processing novel information ("exploration"), mediated predominantly by the right hemisphere, from those for processing that relies on established cognitive routines and representations ("exploitation"), mediated predominantly by the left hemisphere.

    CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

    The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, one of two tasks was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe data preparation, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.

    Relatório de estágio em farmácia comunitária

    Internship report prepared as part of the Integrated Master's in Pharmaceutical Sciences, presented to the Faculty of Pharmacy of the University of Coimbra.
